added alternative embedding models for sentence transformers and openai #101

jamescalam · 2023-08-22T04:56:21Z

Addresses #99. Adds the ability to choose the embedding model via config.yaml, and adds the OpenAI API as an embedding option alongside Sentence Transformers. We can do:

models:
  - type: main
    engine: openai
    model: text-davinci-003
  - type: embedding
    engine: openai
    model: text-embedding-ada-002

and

models:
  - type: main
    engine: openai
    model: text-davinci-003
  - type: embedding
    engine: SentenceTransformers
    model: all-MiniLM-L6-v2
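
For illustration only, here is a minimal sketch of how an `embedding` entry like the ones above might be picked out of the parsed config. The function name `select_embedding_model` and the fallback default are assumptions made for this example, not the code in this PR; the dict mirrors what a YAML parser would produce for the first example.

```python
from typing import Any, Dict

# Hypothetical fallback, used only when the config has no "embedding" entry.
DEFAULT_EMBEDDING = {"engine": "SentenceTransformers", "model": "all-MiniLM-L6-v2"}


def select_embedding_model(config: Dict[str, Any]) -> Dict[str, str]:
    """Return the engine/model pair of the first entry with type 'embedding'."""
    for entry in config.get("models", []):
        if entry.get("type") == "embedding":
            return {"engine": entry["engine"], "model": entry["model"]}
    # No embedding entry configured: fall back to the assumed default.
    return dict(DEFAULT_EMBEDDING)


# Dict mirroring the first YAML example (as a YAML parser would load it).
config = {
    "models": [
        {"type": "main", "engine": "openai", "model": "text-davinci-003"},
        {"type": "embedding", "engine": "openai", "model": "text-embedding-ada-002"},
    ]
}

selected = select_embedding_model(config)
print(selected["engine"], selected["model"])
```

Keeping the lookup separate from the model construction would let the `main` and `embedding` entries use different engines, as in the second example.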

I understand there is also some work ongoing in this space, but it is in a private repo that I can't see, so I apologize if any of this conflicts or overlaps with it. Let me know if any of this can be of use!

Thanks

drazvan · 2023-08-22T21:40:42Z

Thanks for this @jamescalam! I'd like to merge this in the next few days. Let me check against the internal work, but I think we'll first merge this and then apply the rest of the changes.

A couple of points:

- After the merge, we'll likely change the interface to be async. Otherwise, in a server scenario, everything is blocked whenever an embedding is computed through a remote endpoint, as with OpenAI.
- We should probably add a register_embedding_provider method as well, so that a user can register a custom provider from their config.

Nice work! 👍

added alternative embedding models for sentence transformers and openai

24cbf6c

This was referenced Aug 22, 2023

added ability to set embedding models and use openai #100

Closed

Not hardcoded embedding models #99

Closed

drazvan self-assigned this Aug 22, 2023

drazvan added the enhancement label (New feature or request) Aug 22, 2023

drazvan added this to the v0.5.0 milestone Aug 22, 2023

drazvan merged commit 2d0ace1 into NVIDIA:main Sep 1, 2023
1 check failed

krannnn mentioned this pull request Dec 4, 2023

Durable embeddings #200

Closed
